METHOD AND SYSTEM FOR ELIMINATING SYNCHRONIZATION BETWEEN
SWEEP AND ALLOCATE IN A CONCURRENT GARBAGE COLLECTOR

This invention relates to garbage collection for computer memory
management and, in particular, to a concurrent garbage collection 
algorithm. 

Many of the prior art techniques mentioned in the next section are
discussed in greater detail in the following publications:

[1] Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten,
E. F. M. Steffens, On-the-fly Garbage Collection: An Exercise in
Cooperation, Communications of the ACM, November 1978.

[2] Paul Hudak, Robert M. Keller, Garbage Collection and Task
Deletion in Distributed Systems, ACM Symposium on Lisp and Functional
Programming, pp. 168-178, Pittsburgh, PA, August 1982.

[3] Damien Doligez, Xavier Leroy, A concurrent generational garbage
collector for a multithreaded implementation of ML, Proc. 20th Symp. 
Principles of Programming Languages, 1993, pp. 113-123. 

[4] Damien Doligez, Georges Gonthier, Portable, Unobtrusive Garbage
Collection for Multiprocessor Systems, Conference Record of the
Twenty-first Annual ACM Symposium on Principles of Programming Languages,
January, 1994. 

[5] Leslie Lamport, Garbage Collection with Multiple Processes: An
Exercise in Parallelism, 1978. 

[6] Leslie Lamport, How to Make a Multiprocessor Computer that
Correctly Executes Multiprocess Programs, IEEE Transactions on Computers,
C-28(9):690-691, September 1979.

Within the context of computer memory management, garbage
collection relates to the automatic reclamation of computer storage. When
data objects such as arrays, records and other data structures are
created, space for the object is allocated in the heap. The term "object"
is used herein to denote generally any piece of memory. When the object
is no longer needed, its space must be freed in order that the heap does
not become saturated with objects that are no longer required for the
computation. Computer programming languages such as Pascal or C
typically require the programmer to attend to reclamation of heap storage 
manually. The programmer must keep track of information that allows him 
to determine when an object can be safely discarded. This manual heap 
maintenance is feasible, although prone to errors. 



The continuing need to avoid such errors has rendered systems and
languages supporting garbage collected heaps very attractive. Developing
software in such environments is much faster because garbage collection
eliminates a large class of programmer errors, both in the design and
implementation stages. Furthermore, in programming languages such as Java
from Sun Microsystems, which is emerging as a standard Internet tool and
a platform-independent implementation vehicle, there is no explicit
de-allocation by the programmer and therefore use of these languages
mandates a good garbage collection algorithm.

The garbage collector's task is to locate data objects that are no
longer required, and to reclaim their space in memory for use by the
running program. In mark-sweep garbage collectors, garbage collection is
implemented in two successive stages. In a first stage, the object graph
described by the interrelation of objects starting from the roots and
traversing all connected objects in the heap, is traced so as to identify
live objects. An object is considered live if it is reachable either
directly from the roots or from some other live object. Any other object
is considered garbage and can be collected. The roots include global
state (e.g. global variables) and the local state of each thread (e.g.
the thread's stack and its local variables on that stack). The live
objects are marked in some way so as to distinguish between live objects
and garbage. In a second stage, the memory is swept: all the memory space
occupied by unmarked objects (garbage) is reclaimed and the marked
objects are unmarked, in preparation for the next garbage collection
cycle.
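
As a rough illustration of the two stages just described, the following
Python sketch traces from the roots and then sweeps. The dictionary-based
object graph, the object names and the functions mark and sweep are
assumptions made for the example only; they are not taken from any
particular collector.

```python
# Illustrative only: a stop-the-world mark-sweep over a toy object graph.

def mark(roots, children):
    """Stage 1: return the set of objects reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(children.get(obj, ()))
    return marked


def sweep(heap, marked):
    """Stage 2: split the heap into surviving objects and reclaimed garbage."""
    live = [obj for obj in heap if obj in marked]
    garbage = [obj for obj in heap if obj not in marked]
    return live, garbage


# "d" points at "a" but nothing reachable points at "d", so "d" is garbage.
children = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"], "e": []}
heap = ["a", "b", "c", "d", "e"]
live, garbage = sweep(heap, mark(["a"], children))
```

The marked set is simply discarded after the sweep, which plays the part
of unmarking the survivors for the next cycle.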

In so-called "concurrent" garbage collectors, the execution of the
program which updates and changes the object graph is concurrent with the
marking and sweeping operations carried out by the collector. Whilst this 
avoids processor inactivity during garbage collection, the running 
program may change the object graph during the very act of tracing out 
reachable data objects by the collector. For this reason, the running 
program is referred to as the mutator since it mutates or changes the 
object graph. As a result, there exists the risk that the collector may 
miss marking a live object and the live object may then be subsequently 
reclaimed by the collector. In order to avoid this possibility, 
synchronization between the mutator and collector threads is essential.

An important consideration with regard to concurrent collectors is 
their degree of conservatism with respect to changes made by the mutator 
during garbage collection. Thus, an object may have been marked as live
by the garbage collector and subsequently made unreachable by the 
mutator. Such an object constitutes floating garbage which is not 
reclaimed during the current garbage collection cycle. It will, however,
be collected during the next cycle since it will be identified as garbage
at the beginning of the next collection. 

Floating garbage clogs up the heap unnecessarily and thus is 
undesirable. Whilst a certain amount of floating garbage may be tolerated
and, indeed, is inevitable since no garbage collector can be completely 
efficient, the reverse can under no circumstances be tolerated. That is 
to say, reachable objects must never be marked as unreachable by the 
tracer since their space would then be erroneously collected, causing 
possibly catastrophic effects on the application program. This asymmetry
inclines garbage collectors towards being naturally conservative since it
is always better not to reclaim garbage than to reclaim it erroneously. This
conservatism impacts on the manner in which conflicts between mutator
allocation and garbage collector sweep are resolved.

The question arises as to how to mark an object newly allocated by
the mutator, especially during the sweep phase of garbage collection,
which collects unmarked objects and resets the mark of marked objects.
During the sweep phase, an object which is allocated in those locations
of the heap that have not yet been swept in the current sweep cycle must
be allocated as marked, so that the sweep will not collect it. Objects
which are allocated in an area which has already been swept must be
allocated as unmarked in order that they be unmarked for the start of the
next collection. This requires synchronization, be it implicit or
explicit, between the sweep process and the allocation procedure, lest an
object be subsequently reclaimed whilst still alive.

A sub-class of concurrent garbage collectors are so-called "on the
fly" garbage collectors first introduced by Dijkstra et al. [1]. In this
type of garbage collector, the manner in which reachable objects are
marked is by assigning a different color attribute to distinguish between
reachable and unreachable objects. This approach has been adopted in both
concurrent and "on the fly" garbage collectors, a four-color marking
conventionally being used. A "white" color indicates that an object is
unmarked. A "gray" color indicates that an object is marked, but that its
direct descendants may not yet be marked (i.e. some may be white). A
"black" color indicates that an object is marked and that all its direct
descendants are marked (either gray or black). Finally, a "blue" color
indicates that the object is free. Use of a fourth color to distinguish
free objects avoids the need for the garbage collector to trace these
objects, and thus saves time. In such a scheme, "gray" or "black" objects
are also referred to as "shaded" objects. At the start of the cycle all
objects are white. During tracing, the color of live objects progresses
from white to gray to black. After tracing, the collector then sweeps:
white objects are colored blue and appended to the free list; shaded
objects are changed to white in preparation for the next collection
cycle.
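
The color progression described above can be sketched, for a single
thread with no concurrent mutator, as follows. The integer addresses, the
color dictionary and the function names trace and sweep are assumptions
made for illustration; a real on-the-fly collector interleaves these steps
with mutator activity.

```python
# Sequential sketch of the four-color convention; all names are illustrative.
WHITE, GRAY, BLACK, BLUE = "White", "Gray", "Black", "Blue"


def trace(color, children, roots):
    """Shade everything reachable from the roots: White -> Gray -> Black."""
    gray = []
    for r in roots:
        if color[r] == WHITE:
            color[r] = GRAY
            gray.append(r)
    while gray:
        x = gray.pop()
        for child in children.get(x, ()):
            if color[child] == WHITE:
                color[child] = GRAY
                gray.append(child)
        color[x] = BLACK  # every direct descendant of x is now shaded


def sweep(color, free_list):
    """White objects turn Blue and join the free list; shaded ones turn White."""
    for x in sorted(color):
        if color[x] == WHITE:
            color[x] = BLUE
            free_list.append(x)
        elif color[x] in (GRAY, BLACK):
            color[x] = WHITE


color = {0: WHITE, 1: WHITE, 2: WHITE, 3: BLUE}   # cell 3 is already free
children = {0: [1]}
free_list = [3]
trace(color, children, roots=[0])
after_trace = dict(color)
sweep(color, free_list)
```

Blue cells are skipped by both functions, which is the time saving that
the fourth color is meant to provide.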



The advantage of "on the fly" garbage collectors resides in that
there is no synchronization point where the mutator threads have to stop.
This obviates the need for explicit locking which might otherwise lock
out the mutator and collector threads in order to force synchronization
between them. However, as will be seen, this does not itself preclude
implicit synchronization whereby the order of operations as performed by
a thread in a multiprocessor system is significant and must be the same
order perceived by other threads. That is to say, given the absence of
explicit synchronization between collector and mutator threads, what is
referred to as "strong" or "sequential" consistency may be required for
correctness of the collection algorithm. As defined by Lamport [6], a
multiprocessor system is sequentially consistent if the result of any
execution is the same as if the operations of all of the processors were
executed in some sequential order, and the operations of each individual
processor appear in this sequence in the order specified by the program.
An analogous definition for sequential consistency of a multi-threaded or
multi-process execution holds.

There are two requirements for sequential consistency. First, 
program order must be maintained among operations from a single processor 
thread, and secondly a single sequential order must be maintained among 
all operations. For reasons of performance, modern multiprocessors do not
guarantee sequential consistency; rather they provide a more relaxed form
of consistency. In the absence of sequential consistency in a
multiprocessor system, special steps must be taken in order to ensure
that when a new object is allocated during the sweep stage of the
collector, it will be marked the appropriate color. This will now be
explained in greater detail with particular regard to the Doligez and
Gonthier collector [4].

When a mutator allocates a new object, i.e. removes it from the
free list and starts using it, it must assign the proper color to the new
object. The proper color depends on the stage of the collection cycle
currently being executed by the collector thread. While no garbage
collection is taking place and at the start of the collection cycle the
proper color is white. At some point during the mark/trace phase, the
proper color becomes black (the point depends on the specific collection
algorithm). During sweep, the proper color is black if the object is in
an area of the heap that has not yet been swept and white if the object
is in an area that has already been swept. Choosing the proper color
during sweep requires synchronization between the mutator thread
allocating the object and the collector thread. This synchronization may
be implicit and depend on the ordering of read and write operations as in
the collector described by Doligez and Gonthier [4].

The Doligez and Gonthier collector is a descendant of the Dijkstra
collector and is described in pseudo-code. Mutator threads perform
actions including the coloring of newly created objects in cooperation
with the collector. Exactly what actions they need to perform is
determined by where the collector thread is in the collection cycle. To
facilitate this cooperation, each mutator thread has a status field
connected with it which takes one of three values: Sync1, Sync2, Async.
The collector calls for mutators to change their status three times per
collection cycle. The mutators change status in a circular fashion,
progressing from Async to Sync1 to Sync2 and back to Async. When the
collector reaches a certain point in its cycle, it requests that all the
mutators take on the succeeding state. These requests are known as
handshake actions. For example, Handshake(Async) signifies that the
collector is requesting all mutators to change their state from Sync2 to
Async.

The Doligez and Gonthier collector calls for the mutators to
execute a create protocol every time an object, x, is allocated by a
mutator, m. The purpose of the protocol is to choose a color for the
newly created object. It is assumed that a mutator does not respond to a
handshake action, i.e. change its collection status, during the execution
of the create protocol:

    color[x] = Black;
    if (status[m] != Async or x < swept)
        color[x] = White;
    else if (x == swept)
        color[x] = Gray;

Checking the conditions in the create protocol involves accessing a 
global variable, swept, which must be reloaded from memory on each
access. The value of swept represents the collector's progress in
sweeping the heap. While the collector is not sweeping, the global
variable swept is set to some value guaranteed to be larger than the
value of any address in the heap. Just before Mark/Trace, the collector
resets this value to less than the lowest address in the heap. During
sweeping this value is gradually incremented as the collector processes
the elements in the heap. Its value represents the address of the object 
currently being swept. 
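
The effect of the create protocol at different points of the cycle can be
tabulated with a small Python function. This is a sketch under an assumed
reading of the two tests, namely that the first is "the status of m is not
Async, or x lies below swept" and the second is "x equals swept". The
function name create_color, the sentinel values and the assumption that
heap addresses start at 0 are illustrative and not part of the published
algorithm.

```python
import math

WHITE, GRAY, BLACK = "White", "Gray", "Black"


def create_color(status_m, x, swept):
    """Color for a new object at address x, under the assumed reading above."""
    if status_m != "Async" or x < swept:
        return WHITE
    if x == swept:
        return GRAY
    return BLACK


NOT_SWEEPING = math.inf   # larger than any heap address
BEFORE_HEAP = -1          # below the lowest address (heap assumed to start at 0)

idle = create_color("Async", 5, NOT_SWEEPING)         # no collection running
handshaking = create_color("Sync1", 5, BEFORE_HEAP)   # start of the cycle
marking = create_color("Async", 5, BEFORE_HEAP)       # mark/trace under way
sweeping = [create_color("Async", x, 5) for x in (4, 5, 6)]  # swept == 5
```

Under this reading the three sweep cases come out as White for an address
already swept, Gray for the address being swept and Black for an address
not yet reached, which matches the description of the proper color given
earlier.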

Execution of the create protocol is important: if a newly created
object is colored White at the wrong time it will be incorrectly
collected. If it is colored Black, this implies that its immediate
descendants have been marked. Therefore, coloring Black at the wrong
time, i.e. before the immediate descendants are marked, may result in the
descendants being incorrectly collected. It is always safe to color Gray,
but inefficient: if an object is Gray neither it nor its descendants will
be collected. This contradicts the prime goal of the collector, namely to
free unused memory. 

Sweeping in the Doligez and Gonthier collector is done by the 
following pseudo-code: 

    swept = 0;
    while (swept < end_of_heap) do
        if (color[swept] == Black or color[swept] == Gray)
            color[swept] = White;
        else if (color[swept] == White)
            color[swept] = Blue;
            append_to_free_list(swept);
        swept = swept + 1;
    swept = +infinity;

Synchronization between object allocation and sweep is implicit and
complex to understand. It also depends on the allocating mutator thread
reading an up-to-date value of the variable swept. On multiprocessor
architectures that do not guarantee sequential consistency (e.g. the
PowerPC), sweep may require a synchronizing instruction (e.g. sync on
PowerPC) after incrementing the variable swept, and object allocation may
require a synchronizing instruction before reading the value of the
variable swept. These synchronizing instructions are multi-cycle
instructions and may require memory access; thus they are quite
expensive.
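
The dependence on an up-to-date value of swept can be made concrete with
a single-threaded simulation. The sketch below runs the sweep loop one
cell at a time and then compares the color an Async mutator would choose
using the current value of swept with the color it would choose using an
older value. The generator sweep_steps, the shared dictionary and the
simplified create_color (which assumes the mutator is Async and assumes
the reading of the create protocol in which addresses below swept are
White and the address equal to swept is Gray) are all illustrative
assumptions; no real memory model is being simulated.

```python
import math

WHITE, GRAY, BLACK, BLUE = "White", "Gray", "Black", "Blue"


def sweep_steps(color, free_list, shared):
    """Sweep one heap cell per step, publishing progress in shared['swept']."""
    shared["swept"] = 0
    while shared["swept"] < len(color):
        x = shared["swept"]
        if color[x] in (BLACK, GRAY):
            color[x] = WHITE
        elif color[x] == WHITE:
            color[x] = BLUE
            free_list.append(x)
        shared["swept"] = x + 1
        yield
    shared["swept"] = math.inf


def create_color(x, swept):
    """Color chosen by an Async mutator for address x (assumed reading)."""
    if x < swept:
        return WHITE
    return GRAY if x == swept else BLACK


color = [BLACK, WHITE, GRAY, WHITE]
free_list, shared = [], {}
steps = sweep_steps(color, free_list, shared)
next(steps)
next(steps)                      # the collector has now swept cells 0 and 1

fresh = shared["swept"]          # the value the collector has just written
stale = 0                        # an older value a mutator might still observe
color_fresh = create_color(1, fresh)
color_stale = create_color(1, stale)

for _ in steps:                  # let the sweep run to completion
    pass
```

With the current value, the freed cell 1 is allocated White; with the
older value it would be allocated Black even though the sweep has already
passed it. The two choices differ, which is why the original scheme needs
the ordering guarantees discussed above.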

Hudak and Keller [2] describe a collector for an esoteric
distributed applicative processing system (DAPS) model. In this model
there is no shared memory between processors. Thus, consider a standard
stack implementation of the mark phase of a conventional collector in
shared memory. Each root is marked and pushed on to the stack. Nodes are
then repetitively removed from the stack in order to examine each of
their descendants in the object graph. If a descendant is already marked,
no further action is required; otherwise, it is also marked and pushed on
to the stack. Thus, the stack serves as a place-holder for nodes that
have been marked but whose descendants have not yet been examined.
Implementing a stack for DAPS would impose a very high synchronization
overhead. In place of the stack, Hudak and Keller employ a marking tree
of tasks. The marking tree reflects the parallel nature of distributed
marking in a manner analogous to the linear stack reflecting the nature
of sequential marking. Thus, whilst a sequential mutator adds nodes to a
stack, their distributed collector starts a new task and adds it as a
branch in the marking tree.

In order to avoid the synchronization between object allocation and 
sweep, Hudak and Keller further propose switching the meaning of the 
black and white colors on successive collection cycles. In saying this,
it is to be noted that Hudak and Keller themselves acknowledge that the 
term "color" has a different interpretation for their distributed system 
than for conventional shared data structures. In particular, their 
definition of "color" is related to their marking tree data structure. 

The sweep phase in the garbage collector disclosed by Hudak and 
Keller comprises three separate phases. At the end of marking, white 
nodes are garbage, and all tasks pointing to white nodes are irrelevant. 
The sweep phase first terminates irrelevant tasks, then collects all
white nodes by adding them to the free-list, and then prepares the system
for the next collector cycle. In practice, adding white nodes to the free
list requires that they first be "bleached" since nodes on the free-list
have no color in the Hudak and Keller collector. Trace is finished when
there are no gray nodes left and therefore at end of trace all nodes
which are reachable are black. There can also be white and bleached
nodes. This, incidentally, is distinct from the Doligez and Gonthier
collector mentioned above, where there can be gray nodes. Doligez and
Gonthier do not invest the effort to prevent this condition since their
collector works correctly on the assumption that all reachable nodes are
shaded and point to other nodes which are shaded.

Thus, at the start of the sweep in the Hudak and Keller collector,
there can be no gray nodes. The question which remains, therefore, is
what to do with the black nodes. It is inadmissible merely to paint them
white in preparation for the next mark phase, since if this were done at
the same time as the sweeping process is reclaiming white nodes, live
nodes would be freed with fatal consequences. Therefore, Hudak and Keller
simply ignore black nodes until the sweep is complete, whereafter the
mutator is instructed to reverse its sense of black and white. That is,
when the sweep phase is complete, the mutator sees only black nodes. If
now, it interprets them as being white, then the mark phase is ready to
begin. 

The implementation of this approach by Hudak and Keller is 
intimately bound up with the parallel processing afforded by the 
distributed nature of their mutators since, in effect, there exist many 
processing elements each acting independently. When one processing 
element changes its sense of color, it views all nodes in the system as 
being white, even though some other processing element may view the same 
nodes as being black. As long as they are all either white or black, the
mutators behave the same. It is only after all processing elements have
"reversed colors" that the next mark phase is allowed to commence.

It is further to be noted that Hudak and Keller do require locking
when updating a node by a program thread in order to prevent other
processors from updating the same node. In this connection, particular
reference should be made to their two complementary tasks add-ref and
expand-node. Add-ref selectively adds an arc to the marking tree and is
used to spawn a new node in the object graph during tracing. Expand-node
allows a program thread to add a new subgraph to a selected node. In both
cases, a child, or descendant node, may be selected only when the mutator
threads are locked against accessing the memory address of the parent
node. Moreover, the color which is assigned by expand-node to a child
node, depends on the node's hierarchy in the object graph. Thus, the 
color of the parent node must first be checked. If it is Black then the 
child node is also set to Black whilst otherwise it is set to white. 

The need always to check the color of the parent before assigning a
color to a newly allocated object, coupled with the need for explicit
locking, constitutes a major overhead which degrades the performance of the
garbage collector. 

It is thus apparent that the color reversal proposed in the Hudak and
Keller collector is very specific to their DAPS model and is by no means
immediately applicable to other garbage collectors. This is borne out by
the fact that Hudak and Keller published their marking-tree collector in
1982 and since that time no attempt has been made to try to apply their
techniques to other concurrent garbage collectors.

Finally, mention is made of Lamport [5] who also describes a
mechanism for changing the meaning of colors for a concurrent and
on-the-fly collector. He proposes his mechanism in order to pipeline the
collection algorithm, so that the trace of a new collection cycle can work
in parallel with the sweep of the previous collection cycle. His
algorithm does not have the race between allocate and sweep because he
bases his algorithm on Dijkstra's original three-color scheme.

It is a principal objective of the invention to eliminate 
synchronization between sweep and allocate in respect of a newly created 
object in a concurrent garbage collector for a heap implemented in shared 
memory having mark and sweep phases.

A further objective of the invention is to avoid the need in such a
garbage collector to calculate which color must be assigned by a mutator 
to a newly allocated object every time a new object is allocated, thereby 
speeding the color determination and subsequent garbage collection. 

These objectives are realized in accordance with a first aspect of the
invention by a method according to claim 1. 

In order to understand the invention and to see how it may be 
carried out in practice, a preferred embodiment will now be described, by 
way of non-limiting example only, with regard to the known garbage
collector described by Doligez and Gonthier [4] and with reference to the
accompanying drawings, in which:

Fig. 1 is a block diagram showing functionally a computer system 
for implementing the invention; 

Fig. 2 is a block diagram of an exemplary software environment for
the computer system of Fig. 1, illustrating a collector thread according
to the invention; and

Figs. 3 to 11 are flow charts showing the principal operating steps
associated with the collector thread of Fig. 2. 

Hardware Environment 

Fig. 1 shows a computer system depicted generally as 10 being part 
of a network 12 including one or more client computer systems 14, 16 and
18 (e.g. desktop or personal computers, workstations, etc.) coupled to a
server system 20. The network 12 may represent practically any type of
networked interconnection, including but not limited to local-area,
wide-area, wireless and public networks, e.g. the Internet. Moreover, any
number of computers and other devices may be networked through the
network 12, e.g. multiple servers. Alternatively, the principles of the
invention may equally well be implemented by standalone computers and
associated devices consistent with the invention. 

The computer system 18, which may be similar to the computer
systems 14, 16 and 20, may include one or more processors such as a
microprocessor 21. There may further be included a number of peripheral
devices such as a display monitor 22; storage devices 23 such as hard,
floppy and/or CD-ROM disk drives; a printer 24; and various input devices
such as a mouse 26 and a keyboard 27. The computer system 18 operates
under the control of an operating system and executes various computer
software applications, programs, objects, modules, etc. Moreover, various
applications, programs, objects, modules, etc. may also execute on one or
more processors in the server system 20 or other computer systems 14 and
16, e.g. in a distributed computing environment. 



In general, the routines executed to implement the illustrated 
embodiments of the invention, whether implemented as part of an operating
system or a specific application, program, object, module or sequence of
instructions will be referred to herein as "computer programs". The
computer programs typically comprise instructions which, when read and
executed by one or more of the processors in the devices or systems in
the computer system 10, cause those devices or systems to perform the
steps necessary to execute steps or elements embodying the various
aspects of the invention.

Software Environment 

Fig. 2 illustrates one suitable software environment for the
computer system 18 consistent with the invention. The processor 21 is
coupled to a memory 28 as well as to several inputs and outputs. A Java
Virtual Machine (JVM) execution module 30 is illustrated as resident in
the memory 28 and is configured to execute program code on the processor 
21. Specifically, the JVM executes one or more program threads 32, as 
well as a collector thread 34 that is used to deallocate (or "free up") 
unused data stored in an object heap 36. The collector thread 34, which
is described in greater detail below with reference to Figs. 3 to 11 of
the drawings, also uses a plurality of data structures 38 referred to
generally as objects. The execution module 30 may be resident as a 
component of the operating system or of the computer system 18. 
Alternatively, it may be implemented as a separate application that
executes on top of an operating system. Furthermore, any of the execution 
module 30, program thread 32, collector thread 34, object heap 36 and 
collector data structures 38 may, at different times, be resident in 
whole or in part in any of the memory 28, mass storage 23, network 12, or 
within registers and/or caches in the processor 21. 

It should also be noted that the various software components may
also be resident on, and may execute on, other computers coupled to the
computer system 10. Specifically, one particularly useful implementation
of an execution module consistent with the invention is executed in a 
server such as an AS/400 midrange computer system from International 
Business Machines Corporation. 

Overview 

Figs. 3 to 11 show flow diagrams depicting the principal steps in a 
garbage collector according to the invention based on the model described 
by Doligez and Gonthier [4]. Specifically, there will be described the
modifications which must be made thereto, there being no need to describe
in detail those aspects of the garbage collection algorithm which are
fully detailed in the Doligez and Gonthier reference. 

Before actually describing the specific changes that are proposed
to the Doligez and Gonthier algorithm, the basic principles will first be
detailed. To implement the idea of exchanging colors, there are
introduced two global variables, whiteColor and blackColor, which play
the roles that were formerly played by White and Black. Rather than
changing the color of reachable objects during sweeping, the values of
these variables are exchanged. That is, the color that indicated an
object is marked during one collection cycle indicates that it is 
unmarked the next collection cycle. 
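
A minimal sketch of this exchange is given below. The class ColorRoles,
the method names and the two concrete color values COLOR1 and COLOR2 are
assumptions made for the example; the point illustrated is only that
swapping the two variables changes the interpretation of every object's
color without visiting any object.

```python
COLOR1, COLOR2 = "Color1", "Color2"


class ColorRoles:
    """Holds the two global variables that stand in for White and Black."""

    def __init__(self):
        self.white_color = COLOR1
        self.black_color = COLOR2

    def exchange(self):
        # Constant time: no object on the heap is touched.
        self.white_color, self.black_color = self.black_color, self.white_color

    def is_marked(self, c):
        return c == self.black_color


roles = ColorRoles()
# Two survivors of a collection cycle, both carrying the current blackColor.
heap_color = {0: roles.black_color, 1: roles.black_color}
marked_before = [roles.is_marked(c) for c in heap_color.values()]

roles.exchange()   # start of the next cycle
marked_after = [roles.is_marked(c) for c in heap_color.values()]
```

The stored colors in heap_color are identical before and after the
exchange; only their meaning has changed.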

Along with this, objects are colored upon creation according to the
value of a variable. The value of the variable changes once per
collection cycle, from whiteColor to blackColor. There are two possible
implementations for this. One is for each program thread to use its own
thread local variable. This implementation is isomorphic to the Doligez
and Gonthier algorithm with regard to coloring during tracing. The other
variation is to use a global variable.

There will now be described the required changes to the pseudo-code
of the Doligez and Gonthier collector. To implement the new coloring
scheme means for the most part adjusting the pseudo-code wherever White
or Black appeared. Usually those constants are replaced by whiteColor and
blackColor, respectively. There are some exceptions to this such as in
sweep and create. There are a few insertions to the code to maintain the
new variables. The affected pseudo-code is detailed together with some
explanatory comments. The functions given below in pseudo-code and shown
schematically in the Figures are exactly the same as those given by
Doligez and Gonthier in [4] except for some inserted or changed lines
which are marked in the Figures with an asterisk.

Global Variables 

color[x] is the color of the object at address x on the heap. There
are four colors: Color1, Color2, Gray, and Blue.

There are three global variables which take the value of Color1 or
Color2:

• whiteColor and blackColor, which take over the roles formerly
played by White and Black.

• allocationColor, which is used for initialization of the thread
local allocColor[m].

Initially whiteColor and blackColor are opposed and allocationColor 
= blackColor. 

Thread local variables



There is also defined an additional variable local to each mutator 
thread, m, allocColor[m], the color which that mutator assigns an object
upon creation. 



As in Doligez and Gonthier, the collector thread has associated
therewith a status variable, status[c], which takes one of three values:
Sync1, Sync2, Async. Each mutator thread has a status field, status[m],
associated with it which can take on the same three values. The collector
changes its status three times per collection cycle in a circular
fashion, progressing from Async to Sync1 to Sync2 and back to Async. When
the collector changes its status, this serves as a signal to the mutator
threads also to take on the succeeding status.

Operation 

Fig. 3 shows the initialization of the collector thread wherein the
values of whiteColor and blackColor and allocationColor are initialized.
Fig. 4 shows the initialization of a mutator or program thread, wherein
the allocation color variable, local to that thread, allocColor[m], is
assigned its initial value. In both Figs. 3 and 4, there may be required
other initializations in accordance with the Doligez and Gonthier 
algorithm and these remain unchanged. 



Fig. 5 shows the cooperate procedure which is executed at regular
intervals by the mutator threads as in Doligez and Gonthier. Cooperate
checks if the mutator thread's local status[m] is equal to the status[c]
variable of the collector. If so, cooperate terminates. Otherwise, if
status[m] is currently equal to Sync2, then cooperate calls MarkGray in
order to shade each of the thread's local roots. Thereafter, the thread's
allocColor[m] variable is set equal to the global allocation color
variable, allocationColor. The mutator thread's local status[m] is then
set to the status[c] variable of the collector. If status[m] is not
equal to Sync2, then status[m] is simply set to status[c] of the
collector. It is to be noted that the cooperate procedure is identical to
that of Doligez and Gonthier, except for the addition of the assignment
of the allocColor[m] variable. The pseudo-code for the cooperate
procedure is as follows:

cooperate() {
    if (status[m] != status[c])
        if (status[m] == Sync2)
            foreach x in (local roots of m) do
                MarkGray(x);
*           allocColor[m] = allocationColor;
        status[m] = status[c];
}



It will be noted from the following description of the Mark stage 
shown in Fig. 9 that allocationColor is changed to blackColor 
immediately prior to the collector thread initiating the handshake to 
bring the mutator threads to Async. Thus, in the cooperate procedure, 
during the transition from Sync2 to Async, allocationColor is equal to 
blackColor. It thus follows that the allocColor[m] variable is changed to 
blackColor immediately after the thread marks its local roots. By waiting 
until this point, floating garbage is avoided. 
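
The two transitions described above can be traced in a small, 
single-threaded Python sketch. It is an illustration only: the class 
names, the string status values, the dictionary used as the color table 
and the body of mark_gray (shade an object only if it is currently 
whiteColor) are assumptions made for this example and are not taken from 
the figures. 

```python
GRAY = "gray"

class Collector:
    def __init__(self):
        self.status = "SYNC1"
        self.white_color = 0
        self.black_color = 1
        self.allocation_color = self.white_color  # global allocationColor

class Mutator:
    def __init__(self, roots):
        self.status = "SYNC1"
        self.roots = roots
        self.alloc_color = 0  # thread-local allocColor[m]

def mark_gray(color, x, white_color):
    # Assumed behavior: shade an object only if it is currently white.
    if color[x] == white_color:
        color[x] = GRAY

def cooperate(m, c, color):
    if m.status != c.status:
        if m.status == "SYNC2":
            for x in m.roots:
                mark_gray(color, x, c.white_color)
            # The one addition relative to Doligez and Gonthier.
            m.alloc_color = c.allocation_color
        m.status = c.status

def run_transitions():
    c = Collector()
    m = Mutator(roots=["r"])
    color = {"r": c.white_color}
    c.status = "SYNC2"
    cooperate(m, c, color)  # Sync1 -> Sync2: roots are not shaded yet
    after_sync2 = (m.status, color["r"], m.alloc_color)
    c.allocation_color = c.black_color  # set just before the Async handshake
    c.status = "ASYNC"
    cooperate(m, c, color)  # Sync2 -> Async: shade roots, refresh the color
    after_async = (m.status, color["r"], m.alloc_color)
    return after_sync2, after_async
```

In this sketch the thread keeps allocating with the old color throughout 
Sync2 and picks up blackColor only in the same call that shades its 
roots. 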

Fig. 6 shows the create protocol. First, memory is allocated for 
the new object. The color of the new object is then assigned the value of 
allocColor[m]. Thus, the pseudo-code reduces to: 

    pick x ∈ pool
    color[x] = allocColor[m]

It is thus seen that no calculation is required to determine the 
color to assign to a newly allocated object, and that no synchronization 
with the sweep stage is required. 
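
As a minimal sketch of the create protocol, assuming the free pool is a 
Python list of free cell indices and the color table is a dictionary 
(both representations are hypothetical), allocation is a plain store of 
the thread's current allocation color: 

```python
BLUE = "blue"  # color of cells on the free list

def create(pool, color, alloc_color_m):
    """Take a free cell and give it the calling thread's allocation color."""
    x = pool.pop()            # pick x from the pool
    color[x] = alloc_color_m  # no comparison against the sweep position
    return x
```

The function never inspects the collector's phase or the sweep pointer, 
which is the property the text relies on. 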

Fig. 7 shows the collection cycle which consists of four stages: 
Clear, Mark, Scan and Sweep, shown in greater detail in Figs. 8 to 11, 
respectively. As noted in the figure, the sequence of the collection 
cycle remains the same as in the Doligez and Gonthier algorithm. The 
Clear stage shown in Fig. 7 acts to initialize the collection cycle. The 
Mark and Scan stages shown in Fig. 7 together constitute the mark phase 
of a mark-sweep collector and the Sweep stage shown in Fig. 7 constitutes 
the sweep phase thereof. 

At the start of the collection cycle the values of whiteColor and 
blackColor are exchanged. All objects subject to collection are then 
whiteColor and all objects on the free list are Blue. The color of a 
reachable object progresses from whiteColor to Gray to blackColor during 
the Mark and Scan stages of the collector. At the end of the Scan stage 
(Fig. 10), reachable objects are generally blackColor although, owing to 
the race condition in Doligez and Gonthier, some may be Gray. This 
however does not derogate from the correctness of the algorithm. The 
Sweep stage (Fig. 11) frees the whiteColor objects and changes them to 
Blue. 
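
The effect of the exchange can be illustrated with a short sketch. It 
assumes that the two interchangeable colors are stored as the integers 0 
and 1 and that Gray and Blue are distinct fixed values. The names 
ColorRoles and role_of are hypothetical helpers introduced only for this 
example. 

```python
GRAY, BLUE = "gray", "blue"

class ColorRoles:
    """Holds the current meaning of the two interchangeable color values."""
    def __init__(self):
        self.white_color = 0
        self.black_color = 1

    def exchange(self):
        self.white_color, self.black_color = self.black_color, self.white_color

def role_of(stored, roles):
    """Interpret a stored color value under the current roles."""
    if stored == roles.white_color:
        return "white"
    if stored == roles.black_color:
        return "black"
    return stored  # Gray and Blue keep a fixed meaning
```

An object that ended the previous cycle holding the value of blackColor 
is reported as white after exchange() although its stored color was never 
rewritten. 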

Fig. 8 shows the Clear stage which initializes the collection 
cycle. The values of whiteColor and blackColor are exchanged and then the 
collector executes handshake in order to move the mutator threads into 
Sync1. In this handshake the collector changes its status variable from 
Async to Sync1 and then waits until each of the mutator threads has 
changed its status variable to Sync1. The exchange of whiteColor and 
blackColor constitutes the distinction over the Doligez and Gonthier 
algorithm. The pseudo-code is as follows: 

    clear() {
        // exchange whiteColor and blackColor
    *   int temp = whiteColor;
    *   whiteColor = blackColor;
    *   blackColor = temp;
        handshake(SYNC1);
    }
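
The handshake used by the Clear stage can be sketched with ordinary 
Python threads. This is a toy model under stated assumptions: the Shared 
class, the polling intervals and the mutator loop are invented for the 
illustration, the mutator's status update stands in for a call to 
cooperate, and the code relies on CPython making simple attribute and 
list-slot assignments atomic. 

```python
import threading
import time

class Shared:
    def __init__(self, n_mutators):
        self.collector_status = "ASYNC"               # written only by the collector
        self.mutator_status = ["ASYNC"] * n_mutators  # slot m written only by mutator m
        self.running = True

def handshake(shared, new_status):
    """Collector side: post the new status, then wait until all mutators adopt it."""
    shared.collector_status = new_status
    while any(s != new_status for s in shared.mutator_status):
        time.sleep(0.001)  # polling interval chosen arbitrarily for the sketch

def mutator_loop(shared, m):
    """Mutator side: periodically compare statuses, as cooperate would."""
    while shared.running:
        target = shared.collector_status
        if shared.mutator_status[m] != target:
            shared.mutator_status[m] = target
        time.sleep(0.0005)

def run_clear_handshake(n_mutators=3):
    shared = Shared(n_mutators)
    threads = [
        threading.Thread(target=mutator_loop, args=(shared, m), daemon=True)
        for m in range(n_mutators)
    ]
    for t in threads:
        t.start()
    handshake(shared, "SYNC1")
    snapshot = list(shared.mutator_status)
    shared.running = False
    for t in threads:
        t.join()
    return snapshot
```

Because the collector does not post a further status until every mutator 
has caught up, a mutator can never skip a status in this model. 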

Fig. 9 shows the Mark stage, which sets a variable, swept, which 
maps each object in the heap, to an initial value that is smaller than 
the corresponding value for the first object in the heap. This signals to 
the write barrier that the Mark stage has commenced. The write barrier 
remains the same as shown by Doligez and Gonthier and is therefore not 
repeated here. The Mark stage continues along two parallel threads. The 
first thread performs a handshake to bring the mutator threads to Sync2. 
It then sets the global variable allocationColor to blackColor. This step 
distinguishes the first thread from the Doligez and Gonthier algorithm. 
Finally, the first thread performs a second handshake to bring the 
mutator threads to Async. The second thread iterates through the global 
variables and traces the objects reached therefrom. The Trace procedure 
employed in Fig. 9 is the same as in Doligez and Gonthier, except that 
blackColor and whiteColor play the roles of Black and White, 
respectively. The Mark stage terminates when both the first and second 
threads are complete. The pseudo-code is as follows: 

    mark() {
        swept = -infinity;
        cobegin
            handshake(SYNC2);
    *       allocationColor = blackColor;
            handshake(ASYNC);
        and
            foreach x in Globals do
                Trace(x);
    }

Fig. 10 shows the Scan stage which completes the tracing of the 
reachable objects. It is identical to the Scan stage in the Doligez and 
Gonthier algorithm, except that blackColor and whiteColor play the roles 
of Black and White, respectively. 

Fig. 11 shows the Sweep stage which initializes the variable swept 
to zero, this denoting the first object in the heap. For each object in 
the heap, its color is examined. If equal to Gray, then its color is 
reset to blackColor. Otherwise, if its color is equal to whiteColor, then 
it is reset to Blue and the object is freed. If its color is neither Gray 
nor whiteColor, no action is taken. The variable swept is then 
incremented so as to cause the color of the next object in the heap to be 
examined. At the end of the procedure, the value of swept is set to 
infinity, a value which is guaranteed to be larger than the corresponding 
value for the last object in the heap. The pseudo-code for Sweep is as 
follows: 

    sweep() {
        swept = 0;
        while (swept < end_of_heap) do {
    *       if (color[swept] = Gray)
    *           color[swept] = blackColor;
    *       else if (color[swept] = whiteColor) {
                color[swept] = Blue;
                append_to_free_list(swept);
            }
            swept = swept + 1;
        }
        swept = +infinity;
    }

The principal difference between the above-described sweep stage 
and that of Doligez and Gonthier is that, in the invention, objects 
colored blackColor do not need to have their color reset to whiteColor. 
These objects automatically become whiteColor at the start of the next 
collection cycle when the roles of whiteColor and blackColor are 
exchanged. 
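
A minimal sketch of the sweep loop follows, assuming the heap is modeled 
as a Python list of color values indexed by object position and the free 
list as a plain list of indices. In the described collector swept is a 
shared variable that is also read by the write barrier, whereas here it 
is a local variable whose final value is simply returned. 

```python
import math

GRAY, BLUE = "gray", "blue"

def sweep(color, white_color, black_color, free_list):
    """Free white cells, blacken leftover gray ones, leave black cells alone."""
    swept = 0
    while swept < len(color):
        if color[swept] == GRAY:
            color[swept] = black_color
        elif color[swept] == white_color:
            color[swept] = BLUE
            free_list.append(swept)
        # Black and Blue cells need no action: black becomes white when
        # the roles are exchanged at the start of the next cycle.
        swept += 1
    return math.inf  # stands in for the final "swept = +infinity"
```

Note that the loop contains no write that turns a black cell white, 
which is the difference from the original sweep that the text points 
out. 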

Three other functions defined by Doligez and Gonthier require 
amendment, i.e. MarkGray, MarkAndWarn and MarkBlack. In each case, all 
occurrences of White are replaced by whiteColor and Black by blackColor, 
these being the only required changes. 

Alternative embodiment 

In the embodiments so far described, a different local mutator 
thread variable was used in respect of each different mutator thread for 
assigning the color to new objects allocated by that thread. As an 
alternative, a single global variable may be employed in respect of the 
mutator threads. This simplifies the code since the occurrence of 
allocColor[m] in the create protocol is simply replaced by 
allocationColor and all code involving allocColor[m] is removed. 
The line allocationColor = blackColor found in the Mark phase can actually 
occur any time after Handshake(Sync1) and before Handshake(Async). It 
seems logical to place it as late as possible (exactly where it is placed 
above) to minimize the amount of unreclaimable garbage. 



Implementation with a global variable has the disadvantage of 
creating more floating garbage, and the advantage of being slightly 
simpler. The amount of additional floating garbage caused by employing a 
single global allocation color is a function of the number of objects 
which a thread can create without noticing a change in the collector's 
status. 
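
The trade-off can be made concrete with a deliberately simplified model. 
It assumes a worst case in which every object allocated in the window 
between the collector setting allocationColor to blackColor and the 
mutator's next call to cooperate dies before the cycle ends. The function 
names and the string colors are hypothetical. 

```python
WHITE, BLACK = "white", "black"

def allocate_during_flip(n_before_cooperate, use_global):
    """Colors given to objects a mutator allocates after the collector has
    set allocationColor to black but before the mutator next cooperates."""
    allocation_color = BLACK  # the global has already been changed
    alloc_color_m = WHITE     # the thread-local copy has not been refreshed
    colors = []
    for _ in range(n_before_cooperate):
        colors.append(allocation_color if use_global else alloc_color_m)
    return colors

def floating_garbage(colors):
    # Dead objects colored black survive the coming sweep, whereas dead
    # white objects are reclaimed by it.
    return sum(1 for c in colors if c == BLACK)
```

Under these assumptions the global variant leaves one unreclaimed object 
per allocation made in the window, and the per-thread variant leaves 
none. 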

It is thus seen that the invention gives a simple and efficient 
method for implementing the color switch idea for the Doligez and 
Gonthier collector and for collectors sufficiently similar thereto. 

Specifically, the principles of the invention are equally suited to 
other concurrent garbage collectors running on shared memory using an 
algorithm characterized in that: 

a status variable is changed during the collection cycle, and 
actions taken by the program threads when updating and allocating an 
object may depend on that status. 

Likewise, the algorithm may be characterized in that: 
no coordination with the garbage collector is required when a reference 
to an object in the heap is added or updated to, or removed from, a 
mutator's stack. 

The algorithm may be further characterized in that: 
the collector thread has a status variable which only it can modify but 
which can be read by all the program threads, and 

each program thread has a respective status variable which can be read by 
the collector thread. 

The garbage collector may have multiple collector threads. 

Further, whilst the invention has been described with particular 
regard to separate collector and program threads, it should be noted that 
the invention is also applicable to the case that one or more program 
threads intermittently takes on the role of garbage collection. 

It will also be understood that the use of colors in the marking 
phase of a mark sweep garbage collector is arbitrary. The invention has 
been explained with regard to colors because this is the conventional 
terminology. However, any suitable attribute may be employed to denote 
whether an object is marked, whether its descendants too are marked and 
whether a memory location is free. 

The hardware as described makes particular reference to 
implementing the garbage collector within a complete computer system. 
However, it will be appreciated that it can also be implemented in a 
read/write memory component which is sold separate from the computer to 
which it is eventually coupled. 

Likewise, whilst in the preferred embodiment a clear distinction 
has been made between the hardware and software functions of the garbage 
collector, in practice the functions carried out by software in the 
preferred embodiment may be at least partially implemented in hardware as 
part of the memory component. 

In the method claims which follow, alphabetic characters used to 
designate claim steps are provided for convenience only and do not imply 
any particular order of performing the steps. 

1. A computer-implemented method for eliminating synchronization 
between sweep and allocate in respect of a newly created object in a 
concurrent garbage collector for a heap implemented in shared memory 
having mark and sweep phases, the method comprising the steps of: 

(a) in a first collection cycle, associating a first attribute 
with objects believed to be reachable and associating a second attribute 
with objects believed to be unreachable, 

(b) in a successive collection cycle, associating said first 
attribute with objects believed to be unreachable and associating said 
second attribute with objects believed to be reachable, and 

(c) repeating steps (a) and (b) for all successive cycles. 

2. The method according to Claim 1, wherein the first and second 
attributes are colors. 

3. The method according to Claim 2, wherein the first and second 
colors are assigned using respective variables whose values are exchanged 
during alternate collection cycles. 

4. The method according to Claim 3, wherein said values are exchanged 
towards the beginning or the end of each collection cycle. 

5. The method according to Claim 1, further including the steps of: 

(d) employing a separate allocation value to mark newly allocated 
objects for each mutator thread, and 

(e) changing the allocation value at an appropriate point in the 
collection cycle. 

6. The method according to Claim 5, wherein the appropriate point in 
the collection cycle is during the mark phase of the collection cycle. 

7. The method according to Claim 5, wherein said appropriate point in 
the collection cycle is chosen so that each thread starts marking newly 
allocated objects as late as possible in the collection cycle thereby 
eliminating some floating garbage. 

8. The method according to Claim 7, wherein the allocation value is a 
color. 

9. The method according to Claim 1, wherein the garbage collector is 
an on-the-fly garbage collector. 

10. The method according to Claim 1, wherein the garbage collector uses 
an algorithm characterized in that: 

a status variable is changed during the collection cycle, and 
actions taken by the program threads when updating and allocating an 
object may depend on that status. 

11. The method according to Claim 1, wherein the algorithm is 
characterized in that: 

no coordination with the garbage collector is required when a 
reference to an object in the heap is added or updated to, or removed 
from, a mutator's stack. 

12. The method according to Claim 10, wherein the algorithm is 
characterized in that: 

no coordination with the garbage collector is required when a 
reference to an object in the heap is added or updated to, or removed 
from, a mutator's stack. 

13. The method according to Claim 11, wherein the algorithm is further 
characterized in that: 

the collector thread has a status variable which only it can modify 
but which can be read by all the program threads, and 

each program thread has a respective status variable which can be 
read by the collector thread. 

14. The method according to Claim 1, wherein the garbage collector has 
multiple collector threads. 

15. The method according to Claim 5, wherein each program thread 
assigns to a new object an attribute whose value is stored in a 
respective allocation variable. 

16. The method according to Claim 5, wherein each program thread 
assigns to a new object an attribute whose value is stored in a global 
allocation variable. 

17. The method according to Claim 1, wherein one or more program 
threads intermittently takes on a role of garbage collection. 

18. A memory component comprising: 

a heap implemented in shared memory, and 

a mark sweep collector configured to be executed on the heap to 
perform mark and sweep phases of a collection cycle, by: 

associating a first attribute with objects believed to be reachable 
in a first collection cycle, and for associating a second attribute with 
objects believed to be unreachable, and 

exchanging respective roles of the first and second attributes 
during alternate collection cycles. 

19. A computer system comprising the memory component according to 
Claim 18, wherein: 

the first and second attributes are colors, and 

the memory component stores a value representative of said colors. 

20. The computer system according to Claim 19, wherein the first and 
second colors are assigned using respective variables whose values are 
exchanged during alternate collection cycles. 

21. The computer system according to Claim 19, wherein the memory 
component is further configured to: 

employ a separate allocation value to mark newly allocated objects 
for each program thread, and 

change the allocation value at an appropriate point in the mark 
phase. 

22. The computer system according to Claim 21, wherein the memory 
component is configured to choose said appropriate point in the 
collection cycle so that each thread starts marking newly allocated 
objects as late as possible in the collection cycle thereby eliminating 
some floating garbage. 

23. The computer system according to Claim 21, wherein: 
the allocation value is a color, and 

the memory component stores a value representative of said color. 



24. The computer system according to Claim 19, including at least two 
program threads reading from and writing to the shared memory wherein: 

the memory is not sequentially consistent, and 

the mark sweep collector operates to avoid implicit synchronization 
which would otherwise be required. 

25. A program product, comprising: 

a program configured to perform a method for eliminating 
synchronization between sweep and allocate in respect of a newly created 
object in a concurrent garbage collector for a heap implemented in shared 
memory having mark and sweep phases, the method comprising: 

(a) in a first collection cycle, associating a first attribute 
with objects believed to be reachable and associating a second attribute 
with objects believed to be unreachable, 

(b) in a successive collection cycle, associating said first 
attribute with objects believed to be unreachable and associating said 
second attribute with objects believed to be reachable, and 

(c) repeating steps (a) and (b) for all successive cycles. 

26. A computer system comprising: 

a processor and a memory coupled thereto, 
a heap implemented in shared memory, and 

a mark sweep collector configured to be at least partially executed 
on the processor to perform mark and sweep phases of a collection cycle, 
by: 

associating a first attribute with objects believed to be reachable 
in a first collection cycle, and for associating a second attribute with 
objects believed to be unreachable, and 

exchanging respective roles of the first and second attributes 
during alternate collection cycles. 

27. The computer system according to Claim 26, wherein: 
the first and second attributes are colors, and 
the memory component stores a value representative of said colors. 

28. The computer system according to Claim 26, wherein the first and 
second colors are assigned using respective variables whose values are 
exchanged during alternate collection cycles. 

29. The computer system according to Claim 26, wherein the processor is 
further configured to: 

employ a separate allocation value to mark newly allocated objects 
for each program thread, and 

change the allocation value at an appropriate point in the mark 
phase. 

30. The computer system according to Claim 29, wherein the processor is 
configured to choose said appropriate point in the collection cycle so 
that each thread starts marking newly allocated objects as late as 
possible in the collection cycle thereby eliminating some floating 
garbage. 

31. The computer system according to Claim 29, wherein: 
the allocation value is a color, and 

the memory stores a value representative of said color. 

32. The computer system according to Claim 26, including at least two 
program threads reading from and writing to the shared memory wherein: 

the memory is not sequentially consistent, and 

the mark sweep collector operates to avoid implicit synchronization 
which would otherwise be required. 

33. The computer system according to Claim 26, wherein the mark sweep 
collector is at least partially implemented in said memory. 